Introduction


Serum  protein  profiling  using  mass  spectrometry  is  a  promising  approach  to  identify  novel  circulating  breast 
cancer  markers.  One  of  the  major  problems  with  detecting  low-abundance  proteins  in  the  serum  is  that  they  are 
frequently  masked  by  large,  abundant  proteins  such  as  albumin  and  immunoglobulins  among  others.  Therefore, 
serum  protein  fractionation  is  an  important  consideration.  After  fractionation,  protein  profiles  can  be  detected 
using mass spectrometry. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry has been
used to compare serum protein profiles from healthy individuals and cancer patients. However, SELDI-TOF
yields only mass/charge (effectively molecular weight) information and no protein identification. Alternatively,
fractionated  serum  proteins  can  be  analyzed  after  protease  digestion  using  liquid  chromatography  mass 
spectrometry  (LC-MS),  and  the  LC-MS  profiles  can  then  be  compared  to  develop  diagnostic  fingerprints  using 
bioinformatic  techniques.  Differentially  regulated  peptides  can  then  be  identified  by  MS/MS,  allowing 
verification  and  antibody-based  diagnostics  to  be  developed. 

Body 

Thirty  serum  samples  from  healthy  women  and  breast  cancer  patients  at  different  stages  were  fractionated  using 
two  separate  antibody  columns  to  remove  highly  abundant  proteins.  Samples  were  randomized  prior  to 
fractionation  and  mass  spectrometry  testing.  Briefly,  20  microliters  of  serum  were  diluted  and  injected  through  a 
Seppro  column  and  an  Agilent  column  in  tandem  using  appropriate  buffers.  Each  fraction  was  digested  with 
trypsin and subsequently analyzed by LC-MS. Rather than using bioinformatic analysis as a pattern-matching
technique, peptides were targeted based on two criteria: the disease-to-control peak intensity ratio measured in the
average of all mass spectra in each group, and a t-test on the intensity of each individual peak. A series of preprocessing
steps was employed to produce an expansive list of peptides for further investigation and sequencing. These
steps included spectral alignment, baseline subtraction, normalization, identification of local maxima, further
identification of "large" maxima as peaks, and a search for signs of differential expression (Koomen et al., 2005).
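
The peak-calling portion of this preprocessing can be sketched as follows. This is a minimal illustration and not the procedure of Koomen et al.: the rolling-minimum baseline, the total-intensity normalization, the median-based noise threshold, the default window and signal-to-noise values, and all function names are assumptions chosen for brevity.

```python
from statistics import median


def subtract_baseline(intensities, half_window=25):
    """Subtract a rolling-minimum baseline estimate (assumed method)."""
    n = len(intensities)
    corrected = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        corrected.append(intensities[i] - min(intensities[lo:hi]))
    return corrected


def normalize(intensities):
    """Scale a spectrum so that its intensities sum to 1."""
    total = sum(intensities)
    if total == 0:
        return list(intensities)
    return [x / total for x in intensities]


def local_maxima(intensities):
    """Indices of interior points higher than the left neighbor and not lower than the right."""
    return [
        i
        for i in range(1, len(intensities) - 1)
        if intensities[i - 1] < intensities[i] >= intensities[i + 1]
    ]


def call_peaks(spectrum, half_window=25, min_snr=5.0):
    """Keep only the "large" local maxima: those above min_snr times the median intensity."""
    corrected = normalize(subtract_baseline(spectrum, half_window))
    threshold = min_snr * median(corrected)
    return [i for i in local_maxima(corrected) if corrected[i] > threshold]
```

Spectral alignment and the differential-expression test are separate steps and are not covered by this sketch.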

Serum  samples  were  obtained  under  protocol  LAB02277  (UTMDACC)  with  appropriate  consent  forms  on  file, 
aliquoted, and stored frozen at -80 °C. Aliquots (20 ul) from each were separately thawed, diluted 5x in TBS (20
mM, pH 7.6) and injected onto the depletion columns (Agilent-6, Seppro-12) in tandem, flowing at 200 ul per
minute in TBS. The effluent was monitored at 280 nm and the flowthrough was collected. The affinity column
system was flushed with loading buffer, regenerated with 500 mM glycine-HCl pH 2.0 in TBS and
reequilibrated in TBS for the next sample injection. Pilot experiments indicated sample carryover under these
conditions was essentially undetectable. The flowthrough was acetone-precipitated by adding 6 volumes
of cold (-20 °C) acetone and standing at -20 °C overnight. The liquid was carefully decanted, the pellet was washed
once with cold (-20 °C) acetone, and the pellet was air-dried for several minutes. To the pellet, 500 ug trypsin (sequencing
grade, Promega) was added in 50 ul of 30 mM ammonium bicarbonate and the digestion proceeded for 8 hours at
37 °C, after which an additional 500 ug trypsin was added and the sample was incubated overnight. The digestion was quenched
by the addition of acid, and 5 ul was injected on the LC-MS for profiling.

LC-MS was performed using a capillary HPLC (Agilent 1100 capillary) connected to an ESI-TOF mass
spectrometer using a nanoflow interface (Mariner, Applied Biosystems). The separation was performed on a
0.150 mm ID x 15 cm C18 reversed-phase column (C18-MS, Grace-Vydac) flowing at 1 uL/min. Samples
were injected at 97% A (2% acetonitrile in water containing 0.01% trifluoroacetic acid), and salts were flushed out
for 40 minutes. Mass spectral acquisition was then started together with the gradient, which proceeded to 50% B
(80% aqueous acetonitrile containing 0.01% trifluoroacetic acid) over 40 minutes, then ramped up to 90% B
over 5 minutes. After flushing at 90% B, the column was reequilibrated in initial conditions, and two blank
gradients  were  performed  to  reduce  the  possibility  of  peptide  carryover  into  the  next  run.  Preliminary 
experiments  indicated  this  protocol  was  more  than  sufficient  for  this  purpose. 
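
The gradient program described above can be written down as a small piecewise-linear table. The sketch below is illustrative only: it assumes that solvent B starts at 3% (the complement of 97% A), that each segment is linear, and that the composition is simply held at 90% B after the ramp. The names GRADIENT and percent_b are hypothetical.

```python
# (minutes since gradient start, percent B) breakpoints taken from the text;
# the 3% starting value is inferred from "97% A" and is an assumption.
GRADIENT = [(0.0, 3.0), (40.0, 50.0), (45.0, 90.0)]


def percent_b(minutes):
    """Percent B at a given time after gradient start, by linear interpolation."""
    if minutes <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if minutes <= t1:
            return b0 + (b1 - b0) * (minutes - t0) / (t1 - t0)
    # Past the last breakpoint: hold at the final composition (assumed).
    return GRADIENT[-1][1]
```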

Mass spectra were acquired as the sum of 20 seconds of elution time per spectrum over the course of the 90
minute run, resulting in about 270 spectra per sample. A heat map of the LC-MS data for one of the samples is shown in
figure  1  (upper  panel),  above.  The  corresponding  total  ion  chromatogram  (TIC)  is  also  shown  in  figure  1 
(lower  panel).  We  found  there  was  some  variation  in  the  retention  times  of  several  major  peptide  signals,  so  we 
adjusted  the  time  coordinates  slightly  based  on  apparent  retention  times  of  a  number  of  peaks  identified  as 
originating  from  an  abundant  protein,  complement  3.  We  then  calculated  the  offsets  in  various  regions  of  the 
chromatogram,  and  performed  a  piece-wise  adjustment  to  the  apparent  retention  times  for  each  run.  An 
example  of  the  adjustment  is  illustrated  in  figure  2,  below. 


Figure 2. Adjusting the retention time in the neighborhood of the 1385.3 peak. Unadjusted data for mass
1385.3 across the sample set are on the left; the right panel shows the result of the time adjustment.
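
A piece-wise retention time adjustment of this kind can be sketched as below. The report does not specify how offsets were applied between landmark peaks, so this example assumes linear interpolation between landmarks and a constant offset outside them. The function name and the landmark times in the usage note are hypothetical.

```python
from bisect import bisect_right


def adjust_retention_time(t, observed, reference):
    """Map a time from one run onto the reference time scale.

    observed:  sorted landmark peak times in this run
    reference: the same landmark peaks' times on the reference scale
    """
    if len(observed) != len(reference) or not observed:
        raise ValueError("need matching, non-empty landmark lists")
    # Outside the landmark range, apply the nearest landmark's offset.
    if t <= observed[0]:
        return t + (reference[0] - observed[0])
    if t >= observed[-1]:
        return t + (reference[-1] - observed[-1])
    # Between landmarks, interpolate linearly within the enclosing segment.
    j = bisect_right(observed, t)
    t0, t1 = observed[j - 1], observed[j]
    r0, r1 = reference[j - 1], reference[j]
    return r0 + (t - t0) / (t1 - t0) * (r1 - r0)
```

For example, with landmarks observed at 10, 30 and 60 minutes and reference times of 11, 30 and 58 minutes, a signal seen at 20 minutes is mapped to 20.5 minutes.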


Our early analysis of these data generated lists of peaks that appeared to be up- or down-regulated based on
their t-scores. One of them, expected to have a molecular weight of 1339.7, was found in sample number 48. A
portion of this digest was then fractionated and analyzed by LC-MALDI-MS/MS (Dionex-LC Packings HPLC
with Probot plate spotting robot, Applied Biosystems 4700 Proteomics Analyzer). Approximately 50 proteins
were identified in this experiment with reasonable confidence levels. One of these was
Protein S, which was identified on the basis of a single peptide match with the correct MH+
(1340.8) corresponding to the Mr of 1339.7. The match score from the search engine Mascot was 69, normally
a very good score. The spectrum match generated by Mascot is shown in figure 3.


[Mascot Search Results, Peptide View]

MS/MS Fragmentation of IETISHEDLQR
Found in gi|36579, preproprotein S [Homo sapiens]

Match to Query 428 (1340.80, 1+), MaldiWellID: 21866, SpectrumID: 65190

Monoisotopic mass of neutral peptide (Mr): 1339.67

Figure 3. Centroided spectrum match output from Mascot for the target peptide at MH+ = 1340.8. The score for
this match was 69.
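
The reported Mr can be checked by summing standard monoisotopic residue masses for the matched sequence. The sketch below uses commonly tabulated residue masses rounded to five decimals and assumes an unmodified peptide. The function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once for the peptide termini
PROTON = 1.00728   # added for the singly protonated ion


def neutral_mass(sequence):
    """Monoisotopic mass (Mr) of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mh_plus(sequence):
    """Monoisotopic m/z of the singly protonated peptide (MH+)."""
    return neutral_mass(sequence) + PROTON
```

For IETISHEDLQR the computed neutral mass rounds to the Mr of 1339.67 reported by Mascot, and the theoretical MH+ is about 1340.68, roughly 0.1 Da below the observed value of 1340.80.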

Further  statistical  analyses  of  the  dataset  subsequently  revealed  that  the  number  of  dysregulated  peaks  we  found 
was  actually  no  greater  than  the  number  expected  by  pure  chance.  We  believe  that  the  depletion  experiment 
removed  most  of  the  proteins  often  found  to  be  significantly  dysregulated  in  such  experiments,  such  as 
haptoglobin and serum amyloid. Apparently the next tier of proteins detectable by these methods is not
sufficiently perturbed to be distinguished from the noise in this system. We are now pursuing next-generation
strategies to overcome this problem.
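
One way to estimate how many peaks would pass a t-score cutoff by chance alone is to repeat the test with shuffled group labels. The sketch below illustrates that idea and is not necessarily the analysis performed here: the Welch form of the t-statistic, the fixed cutoff, the number of permutations, and all names are assumptions.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two groups of intensities (0.0 if both are constant)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se


def count_hits(peak_table, labels, threshold):
    """Count peaks whose |t| exceeds the cutoff.

    peak_table: one list of intensities per peak, one value per sample
    labels:     one boolean per sample (True = disease, False = control)
    """
    hits = 0
    for intensities in peak_table:
        case = [x for x, is_case in zip(intensities, labels) if is_case]
        ctrl = [x for x, is_case in zip(intensities, labels) if not is_case]
        if abs(welch_t(case, ctrl)) > threshold:
            hits += 1
    return hits


def chance_expectation(peak_table, labels, threshold, n_perm=200, seed=0):
    """Average number of hits obtained when group labels are shuffled."""
    rng = random.Random(seed)
    shuffled = list(labels)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        counts.append(count_hits(peak_table, shuffled, threshold))
    return mean(counts)
```

If count_hits on the real labels is not clearly larger than chance_expectation, the list of apparently dysregulated peaks is consistent with noise, which is the situation described above.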

Key  Research  Accomplishments 

The  Seppro  and  Agilent  antibody  columns  removed  12  of  the  most  abundant  proteins  in  serum,  including 
albumin, IgG, fibrinogen, transferrin, IgA, IgM, α1-antitrypsin, haptoglobin, α1-acid glycoprotein, α2-macroglobulin
and HDL (apolipoproteins A-I and A-II). Using LC-MS and bioinformatic analysis we found 17
differentially  expressed  peaks  in  the  Cancer  vs.  Healthy  groups;  28  differentially  expressed  peaks  in  the  Stage  3 
vs.  Healthy  groups;  and  36  differentially  expressed  peaks  in  the  Stage  4  vs.  Healthy  groups.  Efforts  are  ongoing 
to identify targeted peptide ion signals using tandem matrix-assisted laser desorption/ionization mass
spectrometry (MALDI-MS/MS). One peak indicated to be up-regulated in cancer by the initial bioinformatics
analysis was identified as a peptide from Protein S, but the finding is statistically unconvincing. Further improvements
are  required  to  find  convincing  biomarker  candidates. 

Reportable  Outcomes 

None 

Conclusions 

Serum  fractionation  using  specific  antibody  columns  followed  by  LC-MS  and  bioinformatic  analysis  may  be  a 
feasible  approach  to  peptide  profiling  in  healthy  women  and  breast  cancer  patients.  A  key  advantage  is  that 
detected changes can be identified by MS/MS of the target peptides. A disadvantage compared with the SELDI
experiment is that each sample produces about 100 times more data to process. Still, further
improvements in data processing and analysis appear to be necessary to produce convincing candidate biomarkers.